Genes and proteins for the biosynthesis of the lantibiotic 107891

ABSTRACT

The invention relates to the field of lantibiotics, and more specifically to the isolation of nucleic acid molecules that code for the enzymes required for the biosynthetic pathway of the lantibiotic 107891 and the homologues thereof.

STATEMENT OF RELATED APPLICATIONS

This application is a U.S. National Stage Patent Application, under 35 U.S.C. §371, of International Application No. PCT/IB2007/002270, filed Aug. 3, 2007, which is incorporated herein by reference in its entirety.

SUMMARY

The present invention relates to the field of lantibiotics, and more specifically to the isolation of nucleic acid molecules that code for the enzymes required for the biosynthetic pathway of the lantibiotic 107891 and the homologues thereof. Disclosed are the functions of the gene products involved in 107891 production. The present invention provides novel biosynthetic genes necessary for 107891 production, the encoded polypeptides, the recombinant vectors comprising the nucleic acid sequences that encode said polypeptides, the host cells transformed with said vectors and methods for producing lantibiotics using said transformed host cells, including methods for producing 107891, a precursor thereof, a derivative thereof or a modified lantibiotic different from 107891 or a precursor thereof.

BACKGROUND OF THE INVENTION

The continuous increase of pathogenic bacteria resistant to existing antibiotics is a global health concern, and so there is a pressing need to discover and develop new compounds that are active against resistant bacteria. This has led to renewed interest in natural products, which have in the past been a rich source of antibiotics such as penicillins, macrolides and glycopeptides. Antimicrobial peptides are also attractive candidates, and among them those designated as lantibiotics, i.e. lanthionine-containing antibiotics. Lantibiotics form a particular group within the antimicrobial peptides and are distinguished by several features, such as primary and spatial structure characteristics, unique biosynthetic pathways and peptide modification reactions, and potent antibacterial activity. They are a group of peptide-derived antimicrobial compounds secreted by Gram-positive bacteria and act primarily on Gram-positive bacteria. Lantibiotics are ribosomally synthesised as prepropeptides, which are post-translationally modified to their biologically active forms. The prepeptide consists of an N-terminal leader sequence, which does not undergo any post-translational modification and is cleaved off during or after secretion from the cell, and a C-terminal region (the propeptide), which is post-translationally modified. Lantibiotics are produced by different bacteria: the common feature of these compounds is the presence of one or more lanthionine residues, which consist of two alanine residues covalently cross-linked by a thioether linkage. The thioether bridge is formed when a cysteine residue reacts with a dehydroalanine or dehydrobutyrine moiety to form a lanthionine or methyllanthionine residue, respectively. The dehydroamino acid residues are in turn formed by dehydration of serine and threonine, respectively. Based on their structural and functional properties, lantibiotics are usually divided into two groups, type-A and type-B.
Type-A lantibiotics are elongated, cationic peptides varying in length from 20 to 34 amino acid residues: nisin, subtilin, epidermin and Pep5 are members of this group. Type-B lantibiotics are globular peptides with a net negative charge: examples of this group are mersacidin, cinnamycin, lacticin 481 and actagardine. These structural differences are reflected in the mechanism of action. Type-A compounds exert their antimicrobial activity by blocking cell wall biosynthesis and by forming pores in the cellular membranes, through a mechanism that may or may not be aided by prior docking on the cellular target lipid II. Type-B lantibiotics also exert their antimicrobial activity by inhibiting peptidoglycan biosynthesis, but these compounds do not form pores once bound to lipid II.

Lantibiotics have been shown to have efficacy and utility as food additives and antibacterial agents. Nisin, the most studied lantibiotic, is produced by Lactococcus lactis and is active at low concentrations (low nanomolar MICs) against many Gram-positive bacteria including drug resistant strains and the food-borne pathogens Clostridium botulinum and Listeria monocytogenes. It has been extensively used as a food preservative without substantial development of bacterial resistance. Other lantibiotics show interesting biological activities: for example, epidermin shows high potency against Propionibacterium acnes; cinnamycin and duramycin inhibit phospholipase A₂ and angiotensin converting enzyme, providing potential applications as anti-inflammatory agents and for blood pressure regulation, respectively; and mersacidin inhibits many Gram-positive bacteria, including methicillin-resistant Staphylococcus aureus (MRSA).

The genes responsible for the biosynthesis of lantibiotics are organized in clusters designated by the locus symbol lan, with a more specific genotypic designation for each lantibiotic (e.g., nis for nisin, gdm for gallidermin, cin for cinnamycin). Many lan genes have been sequenced, demonstrating a high level of similarity in gene organization. Each cluster includes the structural gene lanA, encoding the prepeptide, and the gene(s) required for the dehydration of Ser and Thr residues in the propeptide portion of LanA and for thioether formation. For Type-A lantibiotics, LanB carries out the dehydration reactions and LanC is devoted to thioether bridging, whereas in Type-B lantibiotics a single LanM enzyme catalyzes both reactions. Additional genes are usually present in lantibiotic clusters: lanT encodes the ABC transporter for secretion of the lantibiotic, often in combination with a second transport system, encoded by the lanEFG genes; lanP encodes the processing protease, although some clusters lack such a gene, in which case the protease may be part of LanT or its function may be provided by a cellular protease; lanI encodes a protein involved in self-protection; and lanKR are responsible for regulating expression of the lan genes.

There exists the potential and the utility to obtain improved lantibiotics by manipulation of naturally occurring compounds. However, lantibiotics are structurally complex peptides and their accessibility to chemistry is limited to a few positions in the molecule. One of the major limitations of chemistry is that it cannot readily change the type or order of the amino acids present in the peptide backbone. In light of the above, it would be desirable to have genes and enzymes useful for redirecting these steps in lantibiotic formation, in order to obtain derivatives that are hard or impossible to make by chemical means. General methods for the design of novel lantibiotic derivatives directly by fermentation processes with precisely engineered strains would thus be highly desirable.

In fact, unusual amino acids in lantibiotics solely contribute to their biological activity and also enhance their structural stability. Enzymes involved in lantibiotic biosynthesis represent a high potential for peptide engineering by introducing unusual amino acids into desired peptides.

The lantibiotic 107891 shows antibacterial activity against Gram-positive bacteria, including methicillin- and vancomycin-resistant strains, but shows limited activity against Gram-negative bacteria (for example, some Moraxella, Neisseria and Haemophilus spp.). 107891 was isolated from fermentation of Microbispora sp. PTA-5024 (WO 2005/04628 A1). It consists of a complex of closely related factors A1 and A2, whose structure can be traced back to a peptide skeleton, 24 amino acids long, containing lanthionine and methyllanthionine as constituents. In addition, a chlorine atom and one or two -OH residues are present on the molecule. The structure of the components of the 107891 complex is represented by formula 1 of FIG. 3, where R represents OH in factor A1 and (OH)₂ in factor A2. 107891 appears to combine elements of Type-A and Type-B lantibiotics: rings A and B in 107891 are highly related to the equivalent rings in Type-A compounds; however, as in Type-B lantibiotics, 107891 is rather globular, lacks a flexible C-terminal tail and is devoid of charged amino acid residues. Consequently, it cannot be predicted whether the lan cluster devoted to 107891 formation would encode a single LanM enzyme or separate LanB and LanC proteins. Furthermore, there is no precedent for chlorine-containing lantibiotics, thus the genes responsible for this post-translational modification cannot be predicted from available data.

The design of industrial processes for antibiotic production has been relatively successful, resulting in large-scale fermentations with antibiotic titers reaching levels of several grams per liter. This has been achieved largely by following empirical, trial-and-error approaches, and lacks a rational basis. Development of new processes and improvement of current technology thus remain time consuming and may result in bacterial cultures that are unstable, perform inconsistently and accumulate unwanted by-products. In recent years, rational methods have been applied successfully to increase the level of antibiotic produced by Streptomyces spp.; these have often involved the manipulation of key regulatory elements present within the gene cluster of interest or the overexpression of rate-limiting steps in the pathway. Therefore, the genes encoding such cluster-associated regulators or limiting steps in the synthesis can be effective tools for yield improvement. However, the cluster-associated regulators so far identified in actinomycetes belong to several different protein families. Even within one family, there is considerable variation in sequence identity. Therefore, the existence, nature, number and sequence of cluster-associated regulators cannot be predicted by comparison to other clusters, even those specifying a related antibiotic. As an example, the tylosin gene cluster encodes four distinct regulators, while none has been found in the cluster specifying the related macrolide antibiotic erythromycin. Similarly, the nature of and reason for a rate-limiting step in a biosynthetic pathway cannot be established a priori.

Therefore, tools for increasing the 107891 yield would be highly desirable. However, there are no examples of clusters from other members of the genus Microbispora. Therefore, it cannot be predicted through which mechanism(s) the producer strain protects itself from the action of 107891, governs the expression of the other lan genes, or coordinates expression of lan genes with its other cellular processes. Information about these mechanisms would be very useful for optimizing the production process.

DESCRIPTION OF THE INVENTION

The present invention provides a set of isolated polynucleotide molecules required for the biosynthesis of the lantibiotic 107891 in microorganisms.

Thus, according to one of its aspects, the present invention relates to polynucleotide molecules which are selected from the contiguous DNA sequence (SEQ ID NO: 1), which represents the mlb gene cluster as isolated from Microbispora sp. PTA-5024 and consists of 17 ORFs encoding the polypeptides required for 107891 formation.

The amino acid sequences of the polypeptides encoded by said 17 ORFs are provided in SEQ ID NOS: 2 to 18.

The present invention also provides an isolated nucleic acid comprising a nucleotide sequence selected from the group consisting of:

-   a) the mlb gene cluster encoding the polypeptides required for the synthesis of 107891 and homologues thereof (SEQ ID NO: 1);
-   b) a nucleotide sequence encoding the same polypeptides encoded by the mlb gene cluster (SEQ ID NO: 1), other than the nucleotide sequence of the mlb gene cluster itself;
-   c) any nucleotide sequence of mlb ORFs 1 to 17, encoding the polypeptides encoded by SEQ ID NOS: 2 to 18;
-   d) a nucleotide sequence encoding the same polypeptide encoded by any of mlb ORFs 1 to 17 (SEQ ID NOS: 2 to 18), other than the nucleotide sequence of said ORF.

A further subject matter of this invention is to provide an isolated nucleic acid comprising a nucleotide sequence selected from the group consisting of:

-   e) a nucleotide sequence encoding a polypeptide that is, over its full length, at least 65%, preferably 86%, more preferably 90%, most preferably 95% or more, identical in amino acid sequence to a polypeptide encoded by any of mlb ORFs 1 to 17 (SEQ ID NOS: 2 to 18).

In one embodiment, the isolated nucleic acid of this invention comprises an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18), which encodes a polypeptide required for the synthesis of the prepropeptide of 107891.

In another embodiment, the nucleic acid comprises an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18), which encodes the polypeptide required for the synthesis of the 107891 dehydratase enzyme. In yet another embodiment, the nucleic acid comprises an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18), which encodes the polypeptide required for the synthesis of the lanthionine and methyl-lanthionine residues of 107891.

According to another embodiment, in a nucleic acid of this invention, an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18) is provided, which encodes a polypeptide required for the chlorination of the tryptophan residue at amino acid 4 of 107891.

In yet another embodiment, a nucleic acid comprising an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18) is provided, which encodes a polypeptide required for the hydroxylation of the proline residue at amino acid 14 of 107891.

In yet another embodiment, nucleic acid comprising an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18) is provided, which encodes a flavoprotein required for the oxidative decarboxylation that yields the S-[(Z)-2-aminovinyl]-(3S)-3-methyl-D-cysteine (AviMeCys) residues present at positions 21 and 24 of 107891.

According to another embodiment, in the nucleic acid of this invention, an ORF selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18) is provided which encodes the polypeptide required for the reduction of the flavoprotein.

According to yet another embodiment, nucleic acids are provided which comprise combinations of ORFs selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18), encoding polypeptides required for the export of and resistance to 107891.

In yet another embodiment, nucleic acids are provided which comprise combinations of ORFs selected from ORFs 1 to 17 (SEQ ID NOS: 2 to 18), encoding polypeptides required for regulating the expression of the mlb gene cluster.

Those skilled in the art understand that the present invention, having provided the nucleotide sequences encoding polypeptides of the 107891 biosynthetic pathway, also provides nucleotides encoding fragments derived from such polypeptides. According to the invention, those skilled in the art understand that, since the genetic code is degenerate, the same polypeptides encoded by SEQ ID NOS: 2 to 18 can be encoded by natural or artificial variants of ORFs 1 to 17, i.e. by nucleotide sequences other than the genomic nucleotide sequences specified by ORFs 1 to 17 but which encode the same polypeptides.
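The degeneracy argument above can be illustrated with a minimal sketch. The two DNA strings below are hypothetical and are not taken from SEQ ID NO: 1; the sketch assumes the standard genetic code and simply shows that two different nucleotide sequences can specify the same polypeptide.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, codon by codon, into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical example sequences (illustration only).
variant_a = "ATGTCCACCTGC"  # ATG TCC ACC TGC
variant_b = "ATGAGCACGTGT"  # ATG AGC ACG TGT, synonymous codons

same_polypeptide = translate(variant_a) == translate(variant_b)
print(translate(variant_a), translate(variant_b), same_polypeptide)
```

Both hypothetical variants translate to the same four-residue peptide although they differ at several nucleotide positions, which is the situation covered by nucleotide sequences "other than the genomic nucleotide sequences" that encode the same polypeptides.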

Furthermore, it is also understood that naturally occurring or artificially manufactured variants of the polypeptides encoded by SEQ ID NOS: 2 to 18 can occur, said variants having the same function(s) as the above-mentioned original polypeptides but containing additions, deletions or substitutions of amino acids not essential for folding or catalytic function, or conservative substitutions of essential amino acids.

Those skilled in the art understand also that, having provided the nucleotide sequence of the entire cluster required for 107891 biosynthesis, the present invention also provides nucleotide sequences required for the expression of the genes present in said cluster. Such regulatory sequences include but are not limited to promoter and enhancer sequences, antisense sequences, transcription terminator and antiterminator sequences. These sequences are useful for regulating the expression of the genes present in the mlb gene cluster. Cells carrying said nucleotide sequences, alone or fused to other nucleotide sequences, fall also within the scope of the present invention.

In one aspect, the present invention provides isolated nucleic acids comprising nucleotide sequences encoding the ORF6 polypeptide (SEQ ID NO: 7), or naturally occurring variants or derivatives of said polypeptide, encoding the prepropeptide of 107891.

In another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the ORF7 polypeptide (SEQ ID NO: 8), or naturally occurring variants or derivatives of said polypeptide, useful for the dehydration of serine and threonine residues in the lantibiotic precursor.

In yet another aspect, the present invention provides a nucleic acid comprising nucleotide sequences encoding the ORF8 polypeptide (SEQ ID NO: 9), or naturally occurring variants or derivatives of said polypeptide, useful for lanthionine and methyl-lanthionine formation in the lantibiotic precursor.

In another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the ORF9 polypeptide (SEQ ID NO: 10), or naturally occurring variants or derivatives of said polypeptide, useful for AviMeCys formation in the lantibiotic precursor. In another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the ORF15 polypeptide (SEQ ID NO: 16), or naturally occurring variants or derivatives of said polypeptide, useful for chlorinating the tryptophan residue in the lantibiotic precursor.

In yet another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the ORF2 polypeptide (SEQ ID NO: 3), or naturally occurring variants or derivatives of said polypeptide, useful for hydroxylating the proline residue in the lantibiotic precursor.

In another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the polypeptides specified by ORFs 1, 10 to 11, 13 to 14 and 17 (SEQ ID NOS: 2, 11 to 12, 14 to 15, and 18), or naturally or artificially occurring variants or derivatives of said polypeptides, useful for exporting 107891 or a 107891 precursor out of the cells.

In another aspect, the present invention provides nucleic acids comprising nucleotide sequences encoding the ORF3 and ORF5 polypeptides (SEQ ID NOS: 4 and 6), or naturally or artificially occurring variants or derivatives of said polypeptides, useful for regulating lantibiotic production.
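The ORF-to-function assignments stated in the preceding aspects can be collected into a simple lookup table. The following is only a sketch of one way to organize that information: the dictionary layout and the helper name `seq_id_for_orf` are illustrative assumptions, while the role labels paraphrase the text above.

```python
# Roles of the mlb ORFs as stated in the preceding aspects (keys are ORF numbers).
ORF_ROLES = {
    6: "prepropeptide of 107891",
    7: "dehydration of serine and threonine residues",
    8: "lanthionine and methyl-lanthionine formation",
    9: "AviMeCys formation",
    15: "chlorination of the tryptophan residue",
    2: "hydroxylation of the proline residue",
    3: "regulation of lantibiotic production",
    5: "regulation of lantibiotic production",
}
for export_orf in (1, 10, 11, 13, 14, 17):
    ORF_ROLES[export_orf] = "export of 107891 or a 107891 precursor"


def seq_id_for_orf(orf_number: int) -> int:
    """SEQ ID NO: 1 is the whole cluster, so ORF n corresponds to SEQ ID NO: n + 1."""
    if not 1 <= orf_number <= 17:
        raise ValueError("the mlb cluster contains ORFs 1 to 17")
    return orf_number + 1


# ORFs for which the aspects above do not name a specific ORF number.
unassigned = sorted(set(range(1, 18)) - set(ORF_ROLES))

for orf in sorted(ORF_ROLES):
    print(f"ORF{orf} (SEQ ID NO: {seq_id_for_orf(orf)}): {ORF_ROLES[orf]}")
print("No ORF-specific role stated above for ORFs:", unassigned)
```

A table of this kind makes it easy to see which ORFs are tied to a named function in this passage and which are not (ORFs 4, 12 and 16 in the sketch).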

In one embodiment, the present invention provides a lantibiotic-producing strain carrying extra copies of the nucleotide sequences specifying at least one ORF selected from any of ORFs 1 through 17 (SEQ ID NOS: 2 to 18).

In one preferred embodiment, such lantibiotic-producing strain is any strain belonging to the order Actinomycetales.

In yet another preferred embodiment, such lantibiotic producing strain is a member of the genus Microbispora.

In one preferred embodiment, the present invention provides a Microbispora strain containing one or more variations in the nucleotide sequence specified in SEQ ID NO: 1, such variation resulting in an increased or decreased expression of one or more of ORFs 1 through 17 (SEQ ID NOS: 2 to 18).

In one preferred embodiment, the present invention provides nucleic acids comprising a nucleotide sequence specified by SEQ ID NO: 1, or a portion thereof, carried on one or more vectors, useful for the production of 107891, one or more of its precursors or a derivative thereof by another cell.

In one preferred embodiment, said nucleotide sequence or portion thereof is carried on a single vector. Suitable vectors include any cosmid, fosmid, BAC, PAC or ESAC vector capable of carrying the entire mlb cluster as defined herein. Such vectors are well known to those skilled in the art and are described in the literature (Kieser et al., 2000, and references described therein).

In one aspect, the present invention provides a method for increasing the production of 107891, said method comprising the following steps:

-   transforming with a recombinant DNA vector a microorganism that produces 107891 or homologues thereof or precursors of 107891 or homologues thereof by means of a biosynthetic pathway, said vector comprising a DNA sequence, chosen from any of ORFs 1 through 17 (SEQ ID NOS: 2 through 18), that codes for an activity that is rate limiting in said pathway;
-   culturing said microorganism transformed with said vector under conditions suitable for cell growth, expression of said gene and production of said antibiotic or antibiotic precursor.

Suitable host cells belong to the order Actinomycetales, to the families Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae, and to the genera Microbispora, Actinoplanes, Planomonospora, Streptomyces and the like.

In another aspect, the present invention provides a method for producing derivatives of 107891 and homologues thereof, said method comprising the following steps:

-   cloning in a suitable vector a segment chosen from the nucleotide sequence defined by SEQ ID NO: 1, said segment containing at least a portion of one of ORFs 1 through 17 (SEQ ID NOS: 2 through 18), said ORF encoding a polypeptide that catalyzes a biosynthetic step that one wishes to bypass;
-   inactivating said ORF by removing or replacing one or more codons that specify amino acids that are essential for the activity of said polypeptide;
-   transforming with said recombinant DNA vector a microorganism that produces 107891 or homologues thereof or a 107891 precursor thereof by means of a biosynthetic pathway;
-   screening the resulting transformants for those where said DNA sequence has been replaced by the mutated copy;
-   culturing mutant cells under conditions suitable for cell growth, expression of said pathway and production of said pathway analogue.

In yet another aspect, the present invention provides a method for producing novel lantibiotics, said method comprising the following steps:

-   transforming with a recombinant DNA vector a microorganism that produces a lantibiotic or homologues thereof or a precursor thereof by means of a biosynthetic pathway, said vector comprising one or more ORFs, chosen among ORFs 1 through 17 (SEQ ID NOS: 2 through 18), that code for enzyme(s) capable of modifying said lantibiotic or lantibiotic precursor;
-   culturing said microorganism transformed with said vector under conditions suitable for cell growth, expression of said gene and production of said lantibiotic or homologues thereof or lantibiotic precursor thereof.

In yet another aspect, the present invention provides a method for producing novel lantibiotics, said method comprising the following steps:

-   transforming with a recombinant DNA vector a microorganism, said vector comprising one or more ORFs, chosen among ORFs 1 through 17 (SEQ ID NOS: 2 through 18), that code for enzyme(s) capable of modifying a lantibiotic or lantibiotic precursor;
-   culturing said microorganism transformed with said vector under conditions suitable for cell growth and expression of said gene, in the presence of said lantibiotic or lantibiotic precursor.

In yet another aspect, the present invention provides a method for producing novel lantibiotics, said method comprising the following steps:

-   transforming with a recombinant DNA vector a microorganism, said vector comprising one or more ORFs, chosen among ORFs 1 through 17 (SEQ ID NOS: 2 through 18), that code for one or more polypeptides that modify a lantibiotic or lantibiotic precursor;
-   preparing a cell extract or cell fraction of said microorganism under conditions suitable for the presence of active polypeptide(s), said cell extract or cell fraction containing at least said polypeptide(s);
-   adding a lantibiotic or lantibiotic precursor to said cell extract or cell fraction, and incubating said mixture under conditions where said polypeptide(s) can modify said lantibiotic or lantibiotic precursor.

A further aspect of this invention includes an isolated polypeptide involved in the biosynthetic pathway of 107891 selected from:

-   an ORF polypeptide encoded by any one of mlb ORFs 1 to 17 (SEQ ID NOS: 2 through 18); and
-   a polypeptide which is, over its full length, at least 65%, preferably 95% or more, identical in amino acid sequence to a polypeptide encoded by any one of mlb ORFs 1 to 17 (SEQ ID NOS: 2 through 18), preferably by any one of the mlb ORFs 1 to 3, 5 to 11, 13 to 17 (SEQ ID NOS: 2 to 4, 6 to 12, 14 to 18).

A preferred group of polypeptides comprises any ORF polypeptide encoded by any of the mlb ORFs 1 to 3, 5 to 11, 13 to 17 (SEQ ID NOS: 2 to 4, 6 to 12, 14 to 18), or any polypeptide which is at least, over its full length, 65%, preferably 86%, more preferably 90%, most preferably 95% or more, identical in amino acid sequence to a polypeptide encoded by any of said mlb ORFs.

DEFINITIONS

The term “isolated nucleic acid” herein refers to a DNA molecule, either as genomic DNA or a complementary DNA (cDNA), which can be single or double stranded, of natural or synthetic origin. This term refers also to an RNA molecule, of natural or synthetic origin.

The term “nucleotide sequence” herein refers to full length or partial length sequences of ORFs and intergenic regions as disclosed herein.

The term “nucleotide sequence” herein also refers to and/or comprises any one of the nucleotide sequences of the invention as shown in the sequence listing. Any one of the nucleotide sequences of the sequence listing is:

-   A) a coding sequence,
-   B) an RNA molecule derived from transcription of (A),
-   C) a coding sequence which uses the degeneracy of the genetic code to encode an identical polypeptide,
-   D) an intergenic region, containing promoters, enhancers, terminator and antiterminator sequences.

The terms “gene cluster”, “cluster” and “biosynthesis cluster” herein designate a contiguous segment of a microorganism's genome that contains all the genes required for the synthesis of a secondary metabolite.

The term “mlb” herein refers to a genetic element responsible for 107891 biosynthesis in Microbispora sp. PTA-5024. The term is an acronym of Microbispora LantiBiotic.

The term “ORF” herein refers to a genomic nucleotide sequence that encodes one polypeptide. In the context of the present invention, the term ORF is synonymous with “gene”.

The term “ORF polypeptide” herein refers to a polypeptide encoded by an ORF.

The term “mlb ORF” herein refers to an ORF comprised within the mlb gene cluster.

The term “secondary metabolite” herein refers to a bioactive substance produced by a microorganism through the expression of a set of genes specified by a gene cluster.

The term “vector” herein is defined to include, inter alia, any plasmid, cosmid or phage which can transform a prokaryotic host by integration into the cellular genome or which can exist extrachromosomally (e.g. an autonomously replicating plasmid with an origin of replication).

The terms “production host” and “host cell” herein refer to a microorganism where the formation of a secondary metabolite is directed by a gene cluster derived from a donor organism.

The term “homologue” herein refers to a polypeptide, encoded by a biosynthetic gene cluster, which shares at least 65% sequence identity over its entire length with any of the polypeptides encoded by the mlb cluster.
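The 65% threshold in this definition can be made concrete with a short sketch. The text does not prescribe an alignment method, so the function below assumes that a pairwise alignment has already been produced by some external tool and that identity is counted over the full length of the reference (mlb-encoded) polypeptide; the function names and the two toy sequences are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over the full (ungapped) length of the reference.

    Both arguments are rows of an existing pairwise alignment, using '-'
    for gaps. How the alignment was produced is outside this sketch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_length = sum(1 for residue in aligned_ref if residue != "-")
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != "-"
    )
    return 100.0 * matches / ref_length


def is_homologue(aligned_ref: str, aligned_query: str, threshold: float = 65.0) -> bool:
    """Apply the 'at least 65% identity over its entire length' criterion."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Toy alignment (hypothetical sequences, illustration only).
reference = "MKTAYIAKQR"
candidate = "MKTAHIAK-R"

identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identity, homologue: {is_homologue(reference, candidate)}")
```

In this toy case 8 of the 10 reference positions are matched, so the candidate would meet the 65% criterion but not the preferred 86%, 90% or 95% levels.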

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows isolated DNA segments derived from the chromosome of Microbispora sp. PTA-5024. The thick line denotes the segment described in SEQ ID NO: 1. The cosmids carrying said isolated DNA segments are designated 1G6 and 6H6.

FIG. 2 shows the genetic organization of the mlb cluster. Each ORF is represented by an arrow, and numbered as in Table 1 (FIG. 4). The orientation is the same as in FIG. 1. Numbers on the scale bars indicate sequence coordinates (in kb).

FIG. 3 shows the structure of the components of the 107891 complex.

FIG. 4 shows the main features of the ORFs.

A. The Mlb Genes Isolated from Microbispora

107891 is a complex of closely related peptide antibiotics produced by Microbispora sp. PTA-5024. The present invention provides nucleic acid sequences and characterization of the mlb gene cluster for 107891 biosynthesis. The physical organization of the mlb gene cluster, together with flanking DNA sequences, is reported in FIG. 1, which illustrates the physical map of a 20-kb genomic segment from the genome of Microbispora sp. PTA-5024, together with two cosmids defining such segment. The genetic organization of the DNA segment governing 107891 biosynthesis is shown in FIG. 2 and its nucleotide sequence is reported as SEQ ID NO: 1.

The precise boundary of the cluster can be established from the functions of its gene products. Therefore, on the left end (FIG. 1), the mlb cluster is delimited by mlb ORF1, encoding the ABC transporter (SEQ ID NO: 2), involved in the export of 107891. On the right side, the mlb cluster is delimited by mlb ORF17, a membrane ion antiporter (SEQ ID NO: 18). The mlb cluster spans approximately 20,000 base pairs and contains 17 ORFs, designated mlb ORF1 through mlb ORF17. The contiguous nucleotide sequence of SEQ ID NO: 1 (20,000 base pairs) encodes the 17 deduced proteins listed in SEQ ID NOS: 2 to 18.
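The nucleotide coordinates and deduced protein lengths given for mlb ORF1 through ORF17 in the list that follows can be cross-checked arithmetically. The sketch below is an illustration only: it assumes that each coordinate range includes the stop codon, so that a protein of n amino acids should span 3 × (n + 1) nucleotides, and it treats a start coordinate larger than the end coordinate as the complementary strand. The data structure and function names are hypothetical.

```python
# (start, end, deduced amino acids) as printed; start > end = complementary strand.
ORFS = {
    "ORF1": (67, 969, 300),       "ORF2": (966, 2210, 414),
    "ORF3": (2941, 3723, 260),    "ORF4": (3948, 4614, 221),
    "ORF5": (5283, 4621, 220),    "ORF6": (5414, 5587, 57),
    "ORF7": (5706, 9053, 1115),   "ORF8": (9080, 10507, 475),
    "ORF9": (10537, 11184, 215),  "ORF10": (11181, 12131, 316),
    "ORF11": (12253, 12981, 242), "ORF12": (13357, 13992, 211),
    "ORF13": (14795, 15544, 249), "ORF14": (15546, 16256, 236),
    "ORF15": (16370, 17995, 541), "ORF16": (17992, 18528, 178),
    "ORF17": (18525, 19817, 430),
}


def span_nt(start: int, end: int) -> int:
    """Number of nucleotides covered by an inclusive coordinate range."""
    return abs(end - start) + 1


def expected_nt(amino_acids: int) -> int:
    """Expected span, assuming the range includes one stop codon."""
    return 3 * (amino_acids + 1)


def inconsistent_orfs(orfs: dict) -> list:
    """Names of ORFs whose printed span does not match the deduced length."""
    return [
        name
        for name, (start, end, aa) in orfs.items()
        if span_nt(start, end) != expected_nt(aa)
    ]


flagged = inconsistent_orfs(ORFS)
for name in flagged:
    start, end, aa = ORFS[name]
    print(f"{name}: {span_nt(start, end)} nt printed, {expected_nt(aa)} nt expected")
```

With the coordinates as printed, only ORF4 is flagged under this assumption: its range covers 667 nucleotides, one more than the 666 expected for 221 codons plus a stop codon. The sketch only reports the discrepancy; it does not indicate which of the printed values, if any, is in error.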

-   ORF1 (SEQ ID NO: 2) represents 300 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 67 to 969.
-   ORF2 (SEQ ID NO: 3) represents 414 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 966 to 2210.
-   ORF3 (SEQ ID NO: 4) represents 260 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 2941 to 3723.
-   ORF4 (SEQ ID NO: 5) represents 221 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 3948 to 4614.
-   ORF5 (SEQ ID NO: 6) represents 220 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 5283 to 4621 on the complementary strand.
-   ORF6 (SEQ ID NO: 7) represents 57 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 5414 to 5587.
-   ORF7 (SEQ ID NO: 8) represents 1115 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 5706 to 9053.
-   ORF8 (SEQ ID NO: 9) represents 475 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 9080 to 10507.
-   ORF9 (SEQ ID NO: 10) represents 215 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 10537 to 11184.
-   ORF10 (SEQ ID NO: 11) represents 316 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 11181 to 12131.
-   ORF11 (SEQ ID NO: 12) represents 242 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 12253 to 12981.
-   ORF12 (SEQ ID NO: 13) represents 211 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 13357 to 13992.
-   ORF13 (SEQ ID NO: 14) represents 249 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 14795 to 15544.
-   ORF14 (SEQ ID NO: 15) represents 236 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 15546 to 16256.
-   ORF15 (SEQ ID NO: 16) represents 541 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 16370 to 17995.
-   ORF16 (SEQ ID NO: 17) represents 178 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 17992 to 18528.
-   ORF17 (SEQ ID NO: 18) represents 430 amino acids deduced from translating SEQ ID NO: 1 from nucleotides 18525 to 19817.

The Genomic Organization and Primary Sequence of the Mlb Cluster Places 107891 in Type-A Lantibiotics.

Comparison between mlb and other lantibiotic gene clusters reveals important differences. In fact, the mlb cluster is characterized by the presence of several ORFs that do not find homologs in other lantibiotic clusters. These include mlb ORFs 1 through 6, 12, and 15 through 17 (SEQ ID NOS: 2 through 7, 13, and 16 through 18). In conclusion, the organization of the mlb cluster as described herein is substantially different from that of other clusters involved in the synthesis of other lantibiotics. It therefore represents the first example of a cluster with such a genomic organization.

B. Roles of the mlb genes

The present invention discloses the DNA sequence responsible for the synthesis of the prepropeptide precursor of 107891. The 107891 prepropeptide consists of a 33-aa leader peptide and a 24-aa propeptide. The nucleic acid sequences referred to herein are those encoding the 107891 prepropeptide or fragments thereof. The 57-aa 107891 prepropeptide represents a novel element that shows only 41% identity, over its entire length, with UniProt accession number P21838, the prepropeptide of the lantibiotic gallidermin from Staphylococcus gallinarum.

Other genes present in the mlb cluster represent novel genetic elements useful for increasing production of 107891 or for synthesizing novel metabolites. Among these, mlb ORFs 7 and 8 (SEQ ID NOS: 8 and 9) encode the proteins involved in post-translational modification of the translation product of mlb ORF6, to introduce the lanthionine residues in mature 107891. In particular, the mlb ORF7 polypeptide is responsible for dehydration of the Ser and Thr residues in the prepropeptide portion of 107891 to generate dehydroalanine and dehydrobutyrine residues, respectively. The mlb ORF8 polypeptide catalyzes the nucleophilic attack of cysteine residues within the prepropeptide onto the dehydro amino acid residues. These genes can be cloned and expressed in a heterologous host to yield active enzymes capable of introducing lanthionine residues into other prepropeptides.

Yet other preferred nucleic acid molecules of the present invention include mlb ORF9 (SEQ ID NO: 10) that encodes a protein involved in the oxidative decarboxylation yielding the S-[(Z)-2-aminovinyl]-(3S)-3-methyl-D-cysteine residue (Formula I).

Yet other preferred nucleic acid molecules of the present invention include mlb ORF15 (SEQ ID NO: 16) that encodes a tryptophan halogenase, responsible for the addition of a chlorine atom to amino acid 4 of 107891. mlb ORF15 represents a novel and unique genetic element, previously not reported in other lantibiotic clusters. Indeed, chlorination is a rather unique feature of 107891 among known lantibiotics. This gene can be cloned and expressed in a heterologous host to yield an active enzyme capable of chlorinating tryptophan residues of lantibiotic molecules. Alternatively, mlb ORF15 can be inactivated in the producing strain, resulting in the formation of 107891 derivatives devoid of the chlorine attached to amino acid 4.

Yet other preferred nucleic acid molecules of the present invention include mlb ORF2 (SEQ ID NO: 3) that encodes a cytochrome P450 hydroxylase responsible for post-translational modification of the 107891 prepropeptide, by the addition of one or two oxygen atoms to the proline residue at position 14. mlb ORF2 represents a novel and unique genetic element that has never been reported in other lantibiotic clusters. Indeed, the 107891 proline hydroxylation profile is rather unique among lantibiotics. This gene can be cloned and expressed in a heterologous host to yield an active enzyme capable of oxidizing proline residues present in a lantibiotic molecule. Alternatively, mlb ORF2 can be inactivated in the producing strain, resulting in the formation of 107891 derivatives devoid of oxygen atoms at amino acid 14.

The mlb cluster also includes a number of regulatory genes, responsible for activating, directly or indirectly, the expression of biosynthesis and resistance genes during 107891 production. These genes include mlb ORFs 3 and 5 (SEQ ID NOS: 4 and 6): mlb ORF3 (SEQ ID NO: 4) represents a separate quorum-sensing peptide, responsible, in part, for the regulation of 107891 production; and mlb ORF5 (SEQ ID NO: 6) is highly related to the extracytoplasmic function family of Sigma-70 RNA polymerase sigma factors, which act as positive transcriptional regulators. mlb ORFs 3 and 5 represent novel genetic elements, absent from other lantibiotic clusters. The two genes, mlb ORFs 3 and 5, can be cloned and expressed, either individually or in combination, in another lantibiotic producer strain to increase the yield of product formed.

Host strains include but are not limited to strains belonging to the order Actinomycetales, to the families Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae, to the genera Microbispora, Actinoplanes, Planomonospora, Streptomyces and the like. Alternatively, these genes can be overexpressed, individually or in any combination of them, in the 107891 producing strain to increase the yield of 107891.

The mlb cluster also includes a number of genes responsible for exporting lantibiotic intermediates or finished products out of the cytoplasm and for conferring resistance to the producer cell. These genes include mlb ORFs 1, 10 to 11, 13 to 14 and 17 (SEQ ID NOS: 2, 11 to 12, 14 to 15 and 18). mlb ORFs 1, 10 to 11, 13 to 14 encode transporters of the ABC class, responsible for the ATP-dependent excretion of 107891 or its intermediates. mlb ORF17 encodes an Na/K ion-antiporter, responsible for exporting 107891 or its intermediates against a proton gradient. These genes can be cloned and expressed, either individually or in any combination of them, in another lantibiotic producer strain to increase the yield of product formed. Host strains include but are not limited to strains belonging to the order Actinomycetales, to the families Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae, to the genera Microbispora, Actinoplanes, Planomonospora, Streptomyces and the like.

Alternatively, these genes can be overexpressed, individually or in any combination of them, in the 107891 producing strain to increase the yield of 107891.

C. Uses of the Mlb Cluster

The present invention also provides nucleic acids for the expression of the entire 107891 molecule, any of its precursors or a derivative thereof. Such nucleic acids include isolated gene cluster(s) comprising ORFs encoding polypeptides sufficient to direct the assembly of 107891. In one example, the entire mlb cluster (SEQ ID NO: 1) can be introduced into a suitable vector and used to transform a desired production host. In another aspect, the mlb cluster is cloned as two separate segments into two distinct vectors, which can be compatible in the desired production host. In yet another aspect, the mlb cluster can be subdivided into three segments, each cloned into a separate, compatible vector. Examples of the use of one-, two- or three-vector systems have been described in the literature.

Once the mlb cluster has been suitably cloned into one or more vectors, it can be introduced into a number of suitable production hosts, where production of lantibiotics might occur with greater efficiency than in the native host. Preferred host cells are those of species or strains that can efficiently express actinomycete genes. Such hosts include but are not limited to Actinomycetales, Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae, Microbispora, Actinoplanes, Planomonospora and Streptomyces and the like. Alternatively, a second copy of the mlb cluster, cloned into one or more suitable vectors, can be introduced into the 107891 producing strain, where the second copy of the mlb genes will increase the yield of 107891.

The transfer of the producing capability to a well characterized host can substantially improve several portions of the process of lead optimization and development: the titer of the natural product in the producing strain can be more effectively increased; the purification of the natural product can be carried out in a known background of possible interfering activities; the composition of the complex can be more effectively controlled; altered derivatives of the natural product can be more effectively produced through manipulation of the fermentation conditions or by pathway engineering.

Alternatively, the biosynthetic gene cluster can be modified, inserted into a host cell and used to synthesize or chemically modify a wide variety of metabolites: for example, the open reading frames can be re-ordered, modified and combined with other lantibiotic biosynthesis gene clusters.

Using the information provided herein, cloning and expression of 107891 nucleic acids can be accomplished using routine and well known methods.

In another possible use, selected ORFs from the mlb gene cluster are isolated and inactivated by the use of routine molecular biology techniques. The mutated ORF, cloned in a suitable vector containing DNA segments that flank said ORF in the Microbispora sp. PTA-5024 chromosome, is introduced into said Microbispora strain, where a double cross-over event of homologous recombination results in the inactivation of said ORF in the producer strain. This procedure is useful for the production of precursors or derivatives of 107891 in an efficient manner.

In another possible use, selected ORFs from the mlb gene cluster are isolated and placed under the control of a desirable promoter. The engineered ORF, cloned in a suitable vector, is then introduced into Microbispora sp. PTA-5024, either by replacing the original ORF as described above, or as an additional copy of said ORF. This procedure is useful for increasing or decreasing the expression level of ORFs that are critical for production of the 107891 molecule, precursors or derivatives thereof.

Experimental Section

The following examples serve to illustrate the principles and methodologies through which the 107891 gene cluster is identified and the principles and methodologies through which all the mlb genes are identified and analyzed. These examples serve to illustrate the principles and methodologies of the present invention, but are not meant to limit its scope.

General Methods

Unless otherwise indicated, bacterial strains and cloning vectors can all be obtained from public collections or commercial sources. Standard procedures are used for molecular biology. Microbispora is grown in HT agar and in V6 medium (20 g/l glucose, 5 g/l yeast extract, 3 g/l casein hydrolysate, 5 g/l meat extract, 5 g/l peptone, 1.5 g/l NaCl, 0.5% glycerol). Lantibiotics are isolated following published procedures. Sequence analyses are performed using standard programs. Database searches are performed with Blast or Fasta programs at public sites.

EXAMPLE 1 Isolation of 107891 Biosynthesis Genes

A genomic library is made with DNA from Microbispora sp. PTA-5024 in the conjugative cosmid vector Supercos 3. This vector is constructed by insertion of the aacIV-oriT-intΦC31 cassette from pSET152 into Supercos 1 (Stratagene, La Jolla, Calif. 92037) as follows: the aacIV-oriT-intΦC31 cassette is obtained by PCR as a NruI fragment, 3.8 kb long, from the vector pSET152 and inserted into the same site of Supercos 1. Total DNA from Microbispora sp. PTA-5024 is partially digested with Sau3AI in order to optimize fragment sizes in the 40 kb range. The partially digested DNA is treated with alkaline phosphatase and ligated to Supercos 3 previously digested with BamHI. The ligation mixture is packaged in vitro and used to transfect E. coli XL1Blue cells. The resulting cosmid library is screened by hybridization with the oligonucleotide probe 5′-GTS ACS WSS TGG WSS YTS WSS ACS GGS CCS TGC ACS WSS CCS GGS GGS WSS AAC WSS WSS TCC WSS TG-3′ (SEQ ID NO: 19). The oligonucleotide is designed from the amino acid sequence deduced from the structure of 107891. Two cosmids positive to this probe are isolated and physically mapped with restriction enzymes. From such experiments, the cosmids reported in FIG. 1 are identified. The segment thus identified from the genome of Microbispora sp. PTA-5024 contains the mlb gene cluster responsible for the synthesis of the antibiotic 107891.
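The degenerate probe of SEQ ID NO: 19 is written with IUPAC ambiguity codes (S = C or G, W = A or T, Y = C or T). As a rough illustration only, the following Python sketch counts the degenerate positions and the number of distinct sequences in the probe pool. The function name `probe_stats` is hypothetical and not part of the disclosure.

```python
from math import prod

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG", "W": "AT", "Y": "CT", "R": "AG",
    "K": "GT", "M": "AC", "N": "ACGT",
}

# Probe of SEQ ID NO: 19, copied from the text (spaces separate codons).
PROBE = ("GTS ACS WSS TGG WSS YTS WSS ACS GGS CCS TGC ACS WSS CCS "
         "GGS GGS WSS AAC WSS WSS TCC WSS TG")


def probe_stats(probe):
    """Return (length in nt, number of degenerate positions, pool size)."""
    bases = probe.replace(" ", "").upper()
    degenerate = sum(1 for b in bases if len(IUPAC[b]) > 1)
    pool = prod(len(IUPAC[b]) for b in bases)
    return len(bases), degenerate, pool


length, n_degenerate, pool_size = probe_stats(PROBE)
print(f"{length} nt, {n_degenerate} degenerate positions, "
      f"{pool_size} sequences in the pool")
```

Every degenerate position in this probe is two-fold, so the pool size is a power of two. The frequent use of S (C or G) at third codon positions is in keeping with the GC-rich codon usage typical of actinomycetes.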

The above example serves to illustrate the principle and methodologies through which the mlb cluster can be isolated. It will occur to those skilled in the art that the mlb cluster can be cloned in a variety of vectors. However, those skilled in the art understand that, given the 20-kb size of the mlb cluster, preferred vectors are those capable of carrying large inserts, such as lambda, cosmid and BAC vectors. Those skilled in the art understand that other probes can be used to identify the mlb cluster from such a library. From the sequence reported in SEQ ID NO: 1, any fragment can be PCR-amplified from Microbispora sp. PTA-5024 DNA and used to screen a library made with such DNA. One or more clones from said library can be identified that include any segment covered by SEQ ID NO: 1. Furthermore, it is also possible to identify the mlb cluster through the use of heterologous probes, such as those derived from other lan clusters, using the information provided in Table 1. Alternatively, other gene clusters directing the synthesis of secondary metabolites contain genes sufficiently related to the mlb genes as to allow heterologous hybridizations. All these variations fall within the scope of the present invention.

EXAMPLE 2 Sequence Analysis of 107891 Gene Cluster

The mlb cluster, identified as described under Example 1, is sequenced by the shotgun approach. The sequence of the mlb cluster is provided herein as SEQ ID NO: 1. The resulting DNA sequence is analyzed to identify likely coding sequences, which are compared against other lan clusters or searched against GenBank. The exact start codon for each ORF is established by multiple alignment of related sequences or by searching for an upstream ribosomal binding site. In total, 17 ORFs, denominated mlb ORF1 through ORF17, are identified. The results of these analyses are summarized in Table 1, and provided herein in the sequence listing as SEQ ID No: 2 through SEQ ID No: 18. Details are given below.
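The ORF coordinates reported for SEQ ID NO: 1 in the description can be checked for internal consistency against the stated protein lengths. The sketch below is illustrative only: it assumes that each coordinate range includes one terminal stop codon, and the helper name `check_orf` is hypothetical.

```python
# (ORF number, start, end, stated length in amino acids), as listed for SEQ ID NO: 1.
ORFS = [
    (1, 67, 969, 300), (2, 966, 2210, 414), (3, 2941, 3723, 260),
    (4, 3948, 4614, 221), (5, 5283, 4621, 220), (6, 5414, 5587, 57),
    (7, 5706, 9053, 1115), (8, 9080, 10507, 475), (9, 10537, 11184, 215),
    (10, 11181, 12131, 316), (11, 12253, 12981, 242), (12, 13357, 13992, 211),
    (13, 14795, 15544, 249), (14, 15546, 16256, 236), (15, 16370, 17995, 541),
    (16, 17992, 18528, 178), (17, 18525, 19817, 430),
]


def check_orf(start, end, stated_aa):
    """True if the span is a whole number of codons and, after removing one
    stop codon (an assumption), matches the stated protein length."""
    span = abs(end - start) + 1  # start > end marks the complementary strand
    return span % 3 == 0 and span // 3 - 1 == stated_aa


inconsistent = [n for n, start, end, aa in ORFS if not check_orf(start, end, aa)]
print("ORFs whose coordinates do not match the stated length:", inconsistent)
```

With the coordinates as printed, only ORF4 is flagged (667 nucleotides for 221 amino acids), which may reflect a one-base discrepancy in a listed endpoint; the sequence listing itself remains the authoritative reference.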

2A. Synthesis of the 107891 Prepropeptide

mlb ORF6 is responsible for the synthesis of the prepropeptide. The prepropeptide contains a 33-aa leader sequence and a 24-aa propeptide (SEQ ID NO: 7) which is post-translationally modified to produce the mature lantibiotic. Two common features of lantibiotic leader peptides are preserved in 107891: the conserved sequence of type-A lantibiotics (i.e. the F-D/N-L-D/E motif) and the proline residue at position -2. The C-terminal part of the prepropeptide (SEQ ID NO: 7) is in agreement with the published 107891 primary structure and its proposed propeptide sequence.

2B. Post-Translational Modification of the 107891 Propeptide

Four proteins, encoded by mlb ORFs 7 through 9 (SEQ ID NOS: 8 through 10) and mlb ORF16 (SEQ ID NO: 17), are involved in the post-translational modification of the 107891 prepropeptide. Homologs of these gene products are found in many lantibiotic clusters. On the basis of the sequence identities with the dehydratases and cyclases found in other lantibiotic clusters, and their roles, the following predictions can be made. The mlb ORF7 polypeptide is responsible for dehydration of the serine residues at positions 3, 5, 13, 18 and 21 and of the threonine residues at positions 2 and 8 of the 107891 propeptide, to generate the corresponding dehydrated residues. The mlb ORF8 polypeptide catalyzes the regio- and stereospecific conjugate addition of the cysteine residues present in the 107891 propeptide to four dehydroalanine residues and one dehydrobutyrine residue to generate the corresponding five thioethers. Specifically, the mlb ORF8 polypeptide is involved in the formation of the 3-7, 13-20, 18-23 and 21-24 lanthionines and of the 8-11 methyllanthionine.
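The predicted dehydration sites and thioether bridges can be cross-checked with a small data model. The following Python sketch is illustrative only: it uses the positions given above, and the function names are hypothetical. It labels each bridge by the dehydro residue it consumes and lists the dehydrated residues left unbridged.

```python
# Positions in the 24-aa propeptide, as predicted in the text.
DEHYDRATED_SER = {3, 5, 13, 18, 21}  # Ser -> dehydroalanine (Dha)
DEHYDRATED_THR = {2, 8}              # Thr -> dehydrobutyrine (Dhb)

# Thioether bridges as (dehydro residue position, Cys position).
BRIDGES = [(3, 7), (13, 20), (18, 23), (21, 24), (8, 11)]


def classify_bridges(bridges):
    """Label each bridge by the dehydro residue it consumes."""
    labelled = {}
    for donor, cys in bridges:
        if donor in DEHYDRATED_SER:
            labelled[(donor, cys)] = "lanthionine"
        elif donor in DEHYDRATED_THR:
            labelled[(donor, cys)] = "methyllanthionine"
        else:
            raise ValueError(f"position {donor} is not a dehydrated residue")
    return labelled


def free_dehydro_residues(bridges):
    """Dehydrated residues that are not consumed by any bridge."""
    used = {donor for donor, _ in bridges}
    free = {p: "Dha" for p in DEHYDRATED_SER - used}
    free.update({p: "Dhb" for p in DEHYDRATED_THR - used})
    return dict(sorted(free.items()))


print(classify_bridges(BRIDGES))
print(free_dehydro_residues(BRIDGES))
```

Under these positions, four bridges are lanthionines and one is a methyllanthionine, while the dehydrated residues at positions 2 and 5 remain unbridged.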

On the basis of the sequence identities observed with the decarboxylases encoded by the epidermin and mersacidin clusters, mlb ORF9 encodes the enzyme responsible for the decarboxylation of the 21-24 lanthionine moiety. On the basis of the sequence identities observed with the flavin reductases encoded by other antibiotic clusters, mlb ORF16 encodes a flavoprotein reductase. Considering the roles predicted for oxidative decarboxylation during epidermin and mersacidin formation, the mlb ORFs 9 and 16 polypeptides catalyze the formation of the S-[(Z)-2-aminovinyl]-D-cysteine residue present in 107891 (Formula I).

2C. Formation of β-Hydroxyproline and Tryptophan Chlorination

Two proteins, encoded by mlb ORFs 2 and 15 (SEQ ID NOS: 3 and 16), are involved in the addition of one or two β-hydroxyl groups to the proline residue at position 14 and in the chlorination of the tryptophan residue at position 4 of the 107891 propeptide. The mlb ORF2 polypeptide shows significant identity to P450 monooxygenases (Table 1) and is involved in hydroxylation of the proline residue. No homologs of mlb ORF2 have been found in other lantibiotic clusters; this gene thus represents a unique example of a P450 monooxygenase involved in hydroxylation of a lantibiotic molecule. In addition, on the basis of the level of identity with other halogenases, the mlb ORF15 polypeptide is involved in tryptophan chlorination and represents a unique example of a halogenase involved in modification of a lantibiotic molecule.

2D. Export and Resistance

Six proteins, encoded by ORFs 1, 10, 11, 13, 14 and 17 (SEQ ID NOS: 2, 11, 12, 14, 15 and 18), are involved in exporting 107891 or its precursor outside the cytoplasm and in conferring resistance to the producing strain. Their predicted roles are as follows.

Homologs of ORF1 (SEQ ID NO: 2) are not found in other lantibiotic clusters. This gene encodes an additional ABC-type transporter (Table 1), and is therefore involved in conferring resistance to 107891 in the producing strain Microbispora sp. PTA-5024. Homologs of ORFs 10, 13 to 14 and 17 (SEQ ID NOS: 11, 14 to 15 and 18) are present in other lantibiotic clusters (Table 1). They encode ABC-type and ion-dependent transmembrane transporters, respectively. They are thus involved in export and/or compartmentalization of 107891 or its precursors. mlb ORF17 encodes an Na/K ion-antiporter, responsible for exporting 107891 or its intermediates against a proton gradient.

2E. Regulation

Two proteins, encoded by ORFs 3 and 5 (SEQ ID NOS: 4 and 6), are involved in regulating the expression of one or more of the mlb genes. The mlb ORF5 polypeptide (SEQ ID NO: 6) represents a novel genetic element, homologs of which are not found in the other lantibiotic clusters. This protein belongs to the extracytoplasmic function family of sigma factors that act as positive transcriptional regulators. The ORF3 polypeptide (SEQ ID NO: 4) belongs to the family of LuxR-type transcriptional regulators. ORF3 (SEQ ID NO: 4) is therefore likely to be required for the expression of one or more of the mlb genes.

2F. Additional Functions

Two additional ORFs are present in the mlb cluster: ORF4 (SEQ ID NO: 5) and ORF12 (SEQ ID NO: 13). Both ORFs are related to proteins of unknown function present in Salinispora tropica and Streptomyces ambofaciens, respectively (Table 1). However, their precise role in 107891 biosynthesis cannot be predicted yet.

EXAMPLE 3 Manipulation of the 107891 Pathway by Gene Replacement

Using the information provided in Example 2, an in-frame deletion in ORF2 is constructed as follows. Fragment A is obtained through amplification with oligos 5′-AAGCTTGCATCTGCGTGGGCGTCCTGC-3′ (SEQ ID NO: 20) and 5′-TCTAGACGGTCCGAAGATCATGGCCGCGG-3′ (SEQ ID NO: 21); and fragment B is obtained through amplification with oligos 5′-TCTAGATCCATGTGAACCGGCGGGTGGCCG-3′ (SEQ ID NO: 22) and 5′-GAATTCCGGTCGCTCTCCTCGTCCTTTGCC-3′ (SEQ ID NO: 23).
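Each of the four primers carries a restriction site at its 5′ end, which determines how the amplified fragment can be trimmed for ligation. The sketch below is illustrative only and its function name is hypothetical. It uses the standard recognition sequences of EcoRI (GAATTC), HindIII (AAGCTT) and XbaI (TCTAGA) to report the site at the start of each primer.

```python
from typing import Optional

# Recognition sequences of the enzymes named in this example.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "XbaI": "TCTAGA"}

# Primers copied from the text, keyed by SEQ ID NO.
PRIMERS = {
    20: "AAGCTTGCATCTGCGTGGGCGTCCTGC",     # fragment A
    21: "TCTAGACGGTCCGAAGATCATGGCCGCGG",   # fragment A
    22: "TCTAGATCCATGTGAACCGGCGGGTGGCCG",  # fragment B
    23: "GAATTCCGGTCGCTCTCCTCGTCCTTTGCC",  # fragment B
}


def five_prime_site(primer: str) -> Optional[str]:
    """Name of the enzyme whose site opens the primer, or None."""
    for enzyme, site in SITES.items():
        if primer.upper().startswith(site):
            return enzyme
    return None


sites_by_primer = {seq_id: five_prime_site(p) for seq_id, p in PRIMERS.items()}
for seq_id, enzyme in sites_by_primer.items():
    print(f"SEQ ID NO: {seq_id} starts with a {enzyme} site")
```

By this check, the fragment A primers begin with HindIII and XbaI sites and the fragment B primers with XbaI and EcoRI sites, so the two fragments share an XbaI junction and present HindIII and EcoRI ends to the vector.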

Next, fragment A is digested with HindIII and XbaI, fragment B with XbaI and EcoRI, and both are ligated to pSET152 previously digested with EcoRI and HindIII. After transformation of E. coli DH5α cells, the resulting plasmid, designated pDM1, is recognized by the presence of fragments of 4 kb and 1.5 kb after digestion with EcoRI and HindIII. An aliquot of pDM1 is transferred into E. coli ET12567(pUB307) cells, yielding strain DM1. Then, about 10⁸ CFU of DM1 cells, from an overnight culture in LB, are mixed with about 10⁷ CFU of Microbispora sp. PTA-5024 grown in Rare3 medium for about 80 h. The resulting mixture is spread onto HT plates, which are then incubated at 28° C. for about 20 h. After removing excess E. coli cells with a gentle wash with water, plates are overlaid with 3 ml soft agar containing 200 μg nalidixic acid and 15 μg/ml apramycin. After further incubation at 28° C. for 3-5 weeks, Microbispora ex-conjugants are streaked onto fresh medium containing apramycin. One such ex-conjugant, named strain Mb-DM1, is further processed. Strain Mb-DM1 is then grown for several passages in HT medium without apramycin and appropriate dilutions are plated on HT agar without apramycin. Individual colonies are then analyzed by PCR, using oligos 5′-CGCGCTGCTCGGGGCCAAC-3′ (SEQ ID NO: 24) and 5′-AGGAAACGGCCAGCCCGTGG-3′ (SEQ ID NO: 25).

Colonies containing the deleted allele of ORF2 are recognized by the presence of a 1.5 kb band. One such colony, designated Mb-DM2, is grown in HT medium and the formation of dehydroxyl-107891 is confirmed by comparison with an authentic standard.

The above example serves to illustrate the principle and methodologies through which an ORF chosen among any of those specified by SEQ ID NOS: 2 to 18 can be replaced by a mutated copy in the 107891 producing strain Microbispora sp. PTA 5024. It will occur to those skilled in the art that ORF2 (SEQ ID NO: 3) is just an example of the methodologies for creating in frame deletions in the cluster specified by SEQ ID NO: 1.

Those skilled in the art understand also that in-frame deletions are just one method for generating mutations, and that other methods including but not limited to frame-shift mutations, insertions and site-directed mutations can also be used to generate null mutants in any of the ORFs specified by SEQ ID NOS: 2 to 18.

Those skilled in the art also understand that, having established a method for generating mutations in any of the ORFs specified by SEQ ID NO: 1, these same methodologies can be applied for altering the expression levels of these same ORFs. Examples of how this can be achieved include but are not limited to integration of multiple copies of said ORFs into any place in the Microbispora sp. PTA-5024 genome, alteration of the promoters controlling the expression of said ORFs, and removal of antisense RNAs or transcription terminators interfering with their expression.

Finally, variations in the vectors used for introducing the mutated alleles into Microbispora sp. PTA5024, in the conditions for conjugation and cultivation of the donor and recipient strain, in the method for selecting and screening ex-conjugants and their derivatives, all fall within the scope of the present invention.

EXAMPLE 4 In Vitro Halogenation of a Lantibiotic

Using the information provided in Example 2, mlb ORF15 (SEQ ID NO: 16) is overexpressed in E. coli as follows. A 1.6 kb fragment, obtained by amplification with oligos 5′-TTTTTCATATGGGTGGGAGTGATCGGCGGCG-3′ (SEQ ID NO: 26) and 5′-TTTTTGTCGACCTACTGCTGGCCGCGGTCCGGACT-3′ (SEQ ID NO: 27), is digested with NdeI and SalI and ligated to pET22b, previously digested with NdeI and XhoI. After transformation of E. coli DH5α cells, the resulting plasmid pHAL, recognizable by the presence of fragments of 5.5 kb and 1.6 kb after digestion with NdeI and XhoI, is introduced into E. coli BL21(DE3) cells. Cultures of E. coli BL21(DE3) cells harbouring pHAL are grown at 20° C. in LB to an OD600 of 0.6. Then, IPTG is added to 1 mM and cells are grown for a further 6 h. Cells are harvested and ruptured by sonication, and the His-tagged ORF15 polypeptide is recovered from a Ni-agarose column. In vitro halogenation of substrates, such as lacticin 481 or 97518, is carried out and formation of the chlorinated derivative of the lantibiotic is established by MS analysis.
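In the MS analysis, a chlorinated product is expected to differ from its substrate by the replacement of one hydrogen with one chlorine. The Python sketch below is a minimal illustration, not part of the described method. It computes the monoisotopic mass shift and the approximate relative intensity of the ³⁷Cl isotope peak from standard isotope masses and natural abundances, and it deliberately ignores the M+2 contributions of other elements such as ¹³C and ³⁴S.

```python
# Monoisotopic masses in Da (standard values).
H = 1.00782503
CL35 = 34.96885268
CL37 = 36.96590259

# Approximate natural abundances of the two stable chlorine isotopes.
ABUNDANCE_CL35 = 0.7576
ABUNDANCE_CL37 = 0.2424


def chlorination_shift(n_cl: int = 1) -> float:
    """Monoisotopic mass change when n_cl hydrogens are replaced by 35Cl."""
    return n_cl * (CL35 - H)


def cl37_peak_ratio() -> float:
    """Intensity of the 37Cl peak relative to the 35Cl peak for one Cl atom."""
    return ABUNDANCE_CL37 / ABUNDANCE_CL35


print(f"mass shift per Cl: +{chlorination_shift():.4f} Da")
print(f"37Cl/35Cl peak ratio: {cl37_peak_ratio():.2f}")
print(f"spacing of the isotope peaks: {CL37 - CL35:.4f} Da")
```

A gain of about 34 Da together with a second peak roughly 2 Da heavier at about one third of the intensity is the usual signature of a monochlorinated species.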

The above example serves to illustrate the principle and methodologies through which an ORF chosen among any of those specified by SEQ ID NOS: 2 to 18 can be overexpressed in a convenient host, the resulting enzyme overproduced and used for transforming a lantibiotic natural product into a different compound. It will occur to those skilled in the art that ORF15 (SEQ ID NO: 16) is just used as an example of the methodologies for overproducing any polypeptide encoded by SEQ ID NO: 1. Those skilled in the art understand also that other methods for overproducing proteins, including but not limited to the use of different affinity tags, the use of different vectors, of different host strains, and of methods of induction, can also be used to overproduce the polypeptides specified by ORFs 1 to 17 (SEQ ID NOS: 2 to 18).

It will occur to those skilled in the art that other lantibiotic substrates, whether or not they contain a tryptophan residue, can be used for the addition of chlorine atoms by ORF15 (SEQ ID NO: 16). It will also occur to those skilled in the art that other ORF polypeptides specified by the mlb cluster (SEQ ID NO: 1) can be used for the post-translational modification of other lantibiotic prepropeptides. ORF polypeptides that can be used for this purpose include but are not limited to ORF2 (SEQ ID NO: 3). Specifically, ORFs 7 and 8 (SEQ ID NOS: 8 and 9) can be used to dehydrate and introduce thioether bridges in other lantibiotic prepropeptides; ORFs 9 and 16 (SEQ ID NOS: 10 and 17) can be used to decarboxylate other lantibiotics containing a C-terminal Cys residue; and ORF2 (SEQ ID NO: 3) can be used to hydroxylate other proline-containing lantibiotics.

EXAMPLE 5 Expression of the mlb Cluster in a Heterologous Host

Using the information provided in Examples 1 and 2, cosmid 1G6 (FIG. 1) containing the entire mlb cluster (SEQ ID NO: 1) is introduced into Streptomyces albus by conjugation as described by Kieser et al., 2000. Apramycin-resistant exconjugants are grown under appropriate conditions and 107891 is purified as described.

Those skilled in the art understand also that S. albus is just one producer strain, and that other strains can also be used for introducing the entire mlb cluster (SEQ ID NO: 1) and for the production of 107891. Preferred host cells are those of species or strains that can efficiently express actinomycete genes. Such hosts include but are not limited to Actinomycetales, Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae, Microbispora, Actinoplanes, Planomonospora and Streptomyces and the like. Those skilled in the art understand that, provided that suitable promoters are placed in front of mlb operons, production hosts need not be limited to those cells that can efficiently express actinomycete genes. Suitable production hosts of this latter category can be found among those that are easy to manipulate genetically or among those that naturally produce other lantibiotics. Examples of such production hosts include but are not limited to Escherichia coli and related species, Bacillus spp., Streptococcus spp., Lactobacillus spp., Staphylococcus spp., and the like.

Those skilled in the art understand also that using the methodologies described in this example, the mlb cluster can be introduced as a second copy into the original 107891 producer strain Microbispora sp. PTA 5024 where the second copy of the mlb genes will increase the yield of 107891.

Those skilled in the art understand also that different vectors, different methodologies for introducing the mlb cluster, different methods of selecting recombinant clones carrying the mlb cluster, different methods and conditions for growing said recombinant clones and different methods for detecting 107891 production, can be effective for expression of the mlb cluster in a heterologous host.

EXAMPLE 6 Generation of a Library of 107891 Variants

Using the information provided in Examples 2, 3 and 5, mlb ORF 6 (SEQ ID NO: 7) is modified to produce variants of 107891 as follows. First, a vector carrying an in-frame deletion in ORF 6 (SEQ ID NO: 7) is constructed according to the methodologies of Example 3, utilizing oligos 5′-GCAGCCAGGCTCGCACCGGC-3′ and 5′-CGCCCGTAACGAGCGA-3′ (SEQ ID NOS: 28 and 29) for fragment A and oligos 5′-GCAGCTTCTGCTGCTGA-3′ and 5′-TCCCGGCCAGCCACTT-3′ (SEQ ID NOS: 30 and 31) for fragment B to amplify ORF 6. The resulting construct is used to replace ORF 6 in the S. albus strain obtained as described in Example 5, according to the methodologies of Example 3, generating strain SA-D6. Next, the prepeptide portion of mlb ORF6 (SEQ ID NO: 7), obtained by amplification with oligos 5′-CCGGAAAGGAGCGAGCATATG-3′ (SEQ ID NO: 32) and 5′-CAGATCTGCCAATACAGT-3′ (SEQ ID NO: 33), is digested with NdeI and BglII and ligated to pIJ8600 (Kieser et al., 2000), previously digested with NdeI and BglII, generating plasmid pPRE1.

In a parallel experiment, oligo A 5′-C GGT GTC GAG GAG ATC ACC GCC GGG CCG GCG NNN NNN AGC NNN NNN NNN TGC ACC NNN NNN TGC NNN AGC NNN NNN NNN NNN AGC NNN TGC AGC NNN TGC TGC TGA AGA TCT-3′ and oligo B 5′-T TCA GCA GCA NNN GCT GCA NNN GCT NNN NNN NNN NNN GCT NNN GCA NNN NNN GGT GCA NNN NNN NNN GCT NNN NNN CGC CGG CCC GGC GGT GAT CTC CTC GAC ACC GAT CGA-3′ (SEQ ID NOS: 34 and 35) are denatured and annealed to generate a mixture of DNA segments (SEQ ID NO: 36) which encodes polypeptides with all possible changes in the amino acid sequence of the propeptide region of the ORF6 polypeptide except for those involved in thioether formation (specifically amino acids 3, 7, 8, 11, 13, 18, 20, 21, 23 and 24). This mixture of DNA segments is ligated to plasmid pPRE1, previously digested with BsaOI and BglII, to generate a library of plasmids carrying all possible variant forms (except for the (methyl)lanthionine bridges) of the ORF 6 segment encoding the propeptide fused to the ORF 6 segment encoding the leader peptide. This plasmid library is introduced into the SA-D6 strain and the resulting exconjugants, grown in the presence of μg/ml thiostrepton, are screened for the production of 107891 variants by biological assays or HPLC analysis. Interesting variants are further characterized by sequencing the propeptide portion of the ORF 6 variant and by structure elucidation.
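The theoretical size of such a library follows from the number of randomized (NNN) codons. The Python sketch below is illustrative only. It reads the 24 propeptide codons from oligo A, starting after the GCG codon that precedes the first randomized codon, and writes the final Cys codon as TGC, an assumption based on the complementary oligo B. Stop codons arising at randomized positions are ignored in the peptide count.

```python
# Codon pattern for propeptide positions 1 to 24, read from oligo A.
# The last codon is written as TGC (assumption, see lead-in).
PATTERN = ("NNN NNN AGC NNN NNN NNN TGC ACC NNN NNN TGC NNN "
           "AGC NNN NNN NNN NNN AGC NNN TGC AGC NNN TGC TGC").split()

# 1-based positions of randomized and fixed codons.
variable = [i for i, codon in enumerate(PATTERN, start=1) if codon == "NNN"]
fixed = [i for i, codon in enumerate(PATTERN, start=1) if codon != "NNN"]

dna_variants = 64 ** len(variable)       # every codon possible at each NNN
peptide_variants = 20 ** len(variable)   # 20 amino acids per position, stops ignored

print("fixed positions:", fixed)
print("randomized positions:", len(variable))
print(f"DNA-level variants: {dna_variants:.3e}")
print(f"peptide-level variants: {peptide_variants:.3e}")
```

The fixed positions recovered this way coincide with the residues involved in thioether formation. The theoretical diversity is far larger than any transformation can sample, so a practical library covers only a fraction of the possible variants.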

Those skilled in the art understand that the leader peptide portion of the ORF 6 polypeptide can also be modified insofar as the enzymes involved in post-translational modifications (SEQ ID NOS: 8 and 9) can recognize said different leader peptides. Variations in the leader peptide portion of the ORF 6 polypeptide fall within the scope of this invention.

It will occur to those skilled in the art that other methods can be used for constructing a library of ORF 6 variants, including but not limited to the use of different oligos, vectors, induction systems and host strains. Furthermore, those skilled in the art understand that methyllanthionine residues can be replaced by lanthionine residues, and vice versa, to generate additional ORF 6 variants. Those skilled in the art understand also that site-directed mutagenesis can be used to generate selected variants of ORF 6 resulting in the production of specified derivatives of 107891.

The invention claimed is:
 1. A recombinant DNA vector which comprises the nucleotide sequence SEQ ID NO:1.
 2. A host cell transformed with the vector of claim 1.
 3. The transformed host cell of claim 2 which belongs to the order Actinomycetales.
 4. The transformed host cell of claim 3 which belongs to the family selected from the group consisting of Streptosporangiaceae, Micromonosporaceae, Pseudonocardiaceae and Streptomycetaceae.
 5. The transformed host cell according to claim 4 which belongs to the genera selected from the group consisting of Microbispora, Actinoplanes, Planomonospora, and Streptomyces.
 6. The transformed host cell of claim 2 which belongs to the order Bacillales.
 7. The transformed host cell of claim 6 which belongs to the family selected from the group consisting of Bacillaceae, Lactobacillaceae, Streptococcaceae and Staphylococcaceae.
 8. The transformed host cell of claim 7 which belongs to the genera selected from the group consisting of Bacillus, Lactococcus, Streptococcus, and Staphylococcus.
 9. The transformed host cell of claim 2 which belongs to the species Escherichia coli.
 10. A method for increasing production of 107891 by a microorganism capable of producing 107891, said method comprising: transforming with the recombinant DNA vector of claim 1 a microorganism that produces 107891, wherein said DNA vector is an expression vector.